Newton's method for finding a stationary point of $f: \mathbb{R}^n \to \mathbb{R}$ consists of
the iteration

    $x_{i+1} = x_i - [\nabla^2 f(x_i)]^{-1} \nabla f(x_i)$.

Its main attraction is its second-order (quadratic) convergence. However, it requires
computing and inverting the matrix of second-order derivatives (the Hessian).
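
For concreteness, the Newton iteration can be sketched as follows. This is a minimal illustration for a function of two variables only; the test function, the names `grad_f`, `hess_f` and `newton_stationary_point`, and the stopping tolerance are assumptions made for the example and are not taken from the paper.

```python
import math

def newton_stationary_point(grad, hess, x, tol=1e-12, max_iter=50):
    """Newton iteration x <- x - H(x)^{-1} g(x) for a function of two variables."""
    for _ in range(max_iter):
        g1, g2 = grad(x)
        (a, b), (c, d) = hess(x)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Hessian")
        # Solve H s = g for the step s by Cramer's rule (2x2 case only).
        s1 = (d * g1 - b * g2) / det
        s2 = (a * g2 - c * g1) / det
        x = (x[0] - s1, x[1] - s2)
        if max(abs(s1), abs(s2)) < tol:
            break
    return x

# Hypothetical test function: f(x, y) = exp(x + y) + x^2 + y^2.
def grad_f(p):
    e = math.exp(p[0] + p[1])
    return (e + 2.0 * p[0], e + 2.0 * p[1])

def hess_f(p):
    e = math.exp(p[0] + p[1])
    return ((e + 2.0, e), (e, e + 2.0))

x_stat = newton_stationary_point(grad_f, hess_f, (0.0, 0.0))
```

Note that every step needs the full Hessian and the solution of a linear system, which is the cost referred to above.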

Common minimization algorithms approximate the Hessian or its inverse by
first-order (i.e. gradient) information. First-order algorithms in
common use have at best a superlinear rate of convergence [cf. 2].

We  present  a  new  class  of  algorithms  which  use  first  order  information  only, 
while  maintaining  quadratic  convergence. 

At step i of the algorithm, we interpolate f by a suitable interpolating
function T, requiring

(1)    $T(x_{i-j}) = f(x_{i-j})$,    $\nabla T(x_{i-j}) = \nabla f(x_{i-j})$,    $j = 0, 1$,

and determine $x_{i+1}$ as a solution of the equation

(2)    $\nabla T(x_{i+1}) = 0$.

We assume that the interpolating function depends on some parameters. We
further assume that the equations (1) for the parameters of T, and equation
(2) for $x_{i+1}$, have solutions for all i, and that the parameters of T depend
on the data continuously through (1). Finally, we assume that f and T have
continuous derivatives of order 5 near the solution.
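
As an illustration of steps (1) and (2), the following sketch treats the one-dimensional case with T taken to be a cubic polynomial. The cubic is an assumption chosen for simplicity (it is not the interpolating function proposed here), and the function names, the test function and the tolerance are hypothetical. The cubic is written as $T(b+s) = f_b + g_b s + p s^2 + q s^3$ around the current point b, and the root of $T'$ nearest to b is taken as the next iterate.

```python
import math

def two_point_cubic_step(a, b, fa, fb, ga, gb):
    """Fit a cubic T to (f, f') at a and b, then return the root of T' nearest b."""
    h = a - b
    # Conditions T(a) = fa and T'(a) = ga determine p and q.
    A = (fa - fb - gb * h) / (h * h)      # equals p + q*h
    B = (ga - gb) / h                     # equals 2*p + 3*q*h
    p = 3.0 * A - B
    q = (B - 2.0 * A) / h
    # T'(b + s) = gb + 2*p*s + 3*q*s^2 = 0; take the smaller-magnitude root.
    disc = p * p - 3.0 * q * gb
    if disc < 0.0:
        raise ArithmeticError("interpolant has no stationary point")
    denom = p + math.copysign(math.sqrt(disc), p)
    if denom == 0.0:
        raise ArithmeticError("degenerate interpolant")
    return b - gb / denom

def find_stationary_point(f, df, x_prev, x_curr, tol=1e-10, max_iter=30):
    """Iterate (1)-(2) in one dimension using only f and f' (no second derivatives)."""
    a, b = x_prev, x_curr
    for _ in range(max_iter):
        ga, gb = df(a), df(b)
        if abs(gb) < tol:
            break
        b_new = two_point_cubic_step(a, b, f(a), f(b), ga, gb)
        a, b = b, b_new
    return b

# Hypothetical test problem: f(x) = exp(x) - 2x has its stationary point at ln 2.
x_found = find_stationary_point(lambda x: math.exp(x) - 2.0 * x,
                                lambda x: math.exp(x) - 2.0,
                                0.0, 1.0)
```

The error raised when the discriminant is negative corresponds to the case excluded by the assumption that equation (2) has a solution.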


We derive the rate of convergence of the algorithm defined by (1) and (2)
by establishing a difference relation for the errors $e_i = \|x_i - x^*\|$. Here $\|\cdot\|$
is an arbitrary fixed norm. This difference relation is analogous to the one
obtained in [1] for the one-dimensional case.

To this end we define a function $\Psi: \mathbb{R} \to \mathbb{R}^n$. In terms of $\Psi$ and the
functions f and T we can express the errors of a related one-dimensional
interpolation problem.

We assume that there exists a point $x^* \in \mathbb{R}^n$ which is a solution of

(3)    $\nabla f(x^*) = 0$.

Let $\Psi: \mathbb{R} \to \mathbb{R}^n$ be a curve in $\mathbb{R}^n$ through the points $x_{i-j}$, $j = 1, 0, -1$,
and $x^*$, i.e.,

(4)    $\Psi(t_{i-j}) = x_{i-j}$,  $j = 1, 0, -1$,    $\Psi(t^*) = x^*$,

where the parameter t is chosen so that

(5)    $t_{i-j} = \|x_{i-j} - x^*\|$,    $t^* = \|x^* - x^*\| = 0$.

We will later discuss the existence of this construction. Note, however,
that the construction of $\Psi$ is a part of the analysis of the properties of the
algorithm, not a part of the algorithm itself.

Now define $\theta(t) = T(\Psi(t))$, $\phi(t) = f(\Psi(t))$. Equations (1) and (4) imply

    $\theta^{(k)}(t_{i-j}) = \phi^{(k)}(t_{i-j})$,    $j, k = 0, 1$,

which in turn implies [see 3]

(6)    $\phi(t) - \theta(t) = \frac{1}{4!} (\phi - \theta)^{(4)}(\eta) \prod_{j=0}^{1} (t - t_{i-j})^2$,

where $\eta$ is some intermediate point. Equation (6) is the basic difference
relation we need (cf. [1]). Differentiating it and setting t = 0, we obtain


(7)    $t_{i+1} = M_i \, t_i \, t_{i-1}^2$.

If the sequence $M_i$ converges to a non-zero limit, the relation (7) implies
that the sequence $t_i$ converges to zero if $t_0$, $t_1$ are small enough, with a rate
of convergence given by the unique positive root of the indicial
polynomial of (7): $t^2 - t - 2 = 0$, i.e. quadratically [cf. 4].
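
The role of the indicial polynomial can be checked numerically. The sketch below assumes a difference relation of the form $t_{i+1} = M t_i t_{i-1}^2$ with a constant coefficient M, which is the form whose indicial polynomial is $t^2 - t - 2$. The function name and the starting values are hypothetical. It works with $u_i = -\log t_i$, so that the relation becomes the linear recurrence $u_{i+1} = u_i + 2u_{i-1} - \log M$, and returns the ratios $u_{i+1}/u_i$, which estimate the order of convergence.

```python
import math

def empirical_order(M=1.0, t0=0.5, t1=0.4, steps=12):
    """Ratios u_{i+1}/u_i for u_i = -log t_i under t_{i+1} = M * t_i * t_{i-1}**2."""
    # Working in logarithms avoids floating-point underflow of the tiny t_i.
    u_prev, u_curr = -math.log(t0), -math.log(t1)
    log_M = math.log(M)
    ratios = []
    for _ in range(steps):
        u_next = u_curr + 2.0 * u_prev - log_M
        ratios.append(u_next / u_curr)
        u_prev, u_curr = u_curr, u_next
    return ratios

order_estimates = empirical_order()
```

The ratios oscillate around 2 and approach it, in agreement with the factorization $t^2 - t - 2 = (t - 2)(t + 1)$: the root 2 dominates and the root -1 accounts for the oscillation.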

In order for the coefficient sequence in (7) to converge, it is sufficient that $\theta^{(5)}$ and
$\phi^{(5)}$ exist and are continuous, and $\phi''(0) \neq 0$. This would be the case if f has
continuous derivatives of order 5 near the solution, the parameters of T depend
continuously on the data, and T has continuous derivatives of order 5 for
the appropriate values of the parameters. Finally, it is evident that the curve
$\Psi$ can be chosen so that $\phi''(0) = \Psi'(0)^T \nabla^2 f(x^*) \Psi'(0)$ is nonzero, e.g. by choosing

    $\Psi_k(t) = a_k t^4 + b_k t^3 + c_k t^2 + t + d_k$,    $k = 1, \ldots, n$.
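
The existence of such a quartic curve can be illustrated by solving for its coefficients directly. In the sketch below, $d_k = x^*_k$ follows from $\Psi(0) = x^*$, and the remaining coefficients $a_k, b_k, c_k$ are obtained from the three interpolation conditions in (4). The function names, the sample points and the use of the Euclidean norm are assumptions for the example, and the parameter values $t_{i-j}$ are assumed to be distinct and nonzero.

```python
import math

def solve3(A, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[piv][col] == 0.0:
            raise ZeroDivisionError("singular system: nodes must be distinct and nonzero")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def quartic_curve(points, x_star, norm):
    """Return the nodes t_j = ||x_j - x*|| and a curve psi with psi(t_j) = x_j, psi(0) = x*."""
    ts = [norm([p - s for p, s in zip(pt, x_star)]) for pt in points]
    A = [[t ** 4, t ** 3, t ** 2] for t in ts]
    coeffs = []
    for k in range(len(x_star)):
        # Move the fixed terms t + d_k to the right-hand side.
        rhs = [pt[k] - t - x_star[k] for pt, t in zip(points, ts)]
        a, b, c = solve3(A, rhs)
        coeffs.append((a, b, c, x_star[k]))

    def psi(t):
        return [a * t ** 4 + b * t ** 3 + c * t ** 2 + t + d for a, b, c, d in coeffs]

    return ts, psi

def euclid(v):
    return math.sqrt(sum(c * c for c in v))

# Hypothetical data in R^2: three iterates and the solution x* = (0, 0).
points = [(1.0, 0.5), (0.3, -0.2), (0.05, 0.1)]
x_star = (0.0, 0.0)
ts, psi = quartic_curve(points, x_star, euclid)
```

Because the coefficient of t is fixed at 1 in every component, $\Psi'(0) = (1, \ldots, 1)$ regardless of the data.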

Note  that  no  line  search  is  needed  in  this  class  of  algorithms,  and  that 
they  may  be  designed  to  locate  saddle  points  rather  than  minimum  points. 

A useful choice for the interpolating function T seems to be a separable
sum of rational functions of the type discussed in [1].

The results in [1] for the one-dimensional case can clearly be extended
by the same device to the n-dimensional case. In particular, algorithms based on
function values only have rates of convergence between 1.3 and 1.6; the
rate of convergence is independent of the interpolating function, and inverse
interpolation can be utilized for minimization. Similar results hold for the
root-finding problem discussed by Traub [5]. Details of this work will appear
elsewhere.